Trying to mimic human segmentation of speech using HMM and fuzzy logic post-correction rules
Abstract
Human segmentation and labelling of speech can be seen as a two-step process. In the first step, humans listen to a speech signal, recognize the word and phoneme sequence, and roughly determine the position of each phonetic boundary. In the second step, they examine several speech signal features (waveform, energy, spectrogram, etc.) and place a phonetic boundary time mark where these features best satisfy a set of conditions specific to that kind of phonetic boundary. This paper presents an automatic two-stage system for phonetic segmentation and labelling of speech that tries to mimic this two-step process. The first stage is a context-dependent phonetic HMM recognizer that yields the recognized phoneme sequence and a set of rough phonetic boundary time marks. The second stage extracts several speech signal features intended to be the counterpart of those examined by humans, and uses them to refine each rough time mark obtained in the first stage. Each time mark is moved to a nearby position where the degree of truth of a set of fuzzy logic conditions (specific to that kind of phonetic boundary) is maximal. These fuzzy logic conditions are intended to be the counterpart of the conditions tested by humans.
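The second-stage refinement described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the feature (frame energy), the membership thresholds, the rule `silence_to_vowel`, and the helper names `ramp` and `refine_boundary` are all assumptions made for illustration. The sketch only shows the general idea of moving a rough HMM time mark to the nearby frame where a fuzzy condition is most true, using `min` as the fuzzy AND.

```python
from typing import Callable


def ramp(x: float, lo: float, hi: float) -> float:
    """Piecewise-linear membership: 0 at or below lo, 1 at or above hi."""
    if x <= lo:
        return 0.0
    if x >= hi:
        return 1.0
    return (x - lo) / (hi - lo)


def refine_boundary(rough: int, n_frames: int,
                    truth: Callable[[int], float], max_shift: int) -> int:
    """Return the frame within max_shift of the rough mark that maximizes
    the degree of truth; ties go to the frame closest to the rough mark."""
    lo = max(0, rough - max_shift)
    hi = min(n_frames - 1, rough + max_shift)
    return max(range(lo, hi + 1), key=lambda t: (truth(t), -abs(t - rough)))


# Hypothetical per-frame energy values (assumed data, not from the paper).
energy = [0.1, 0.1, 0.12, 0.15, 0.9, 1.0, 1.0, 0.95]


def silence_to_vowel(t: int) -> float:
    """Assumed rule for a silence-to-vowel boundary:
    'energy just before is low' AND 'energy rises sharply at t'."""
    if t == 0:
        return 0.0
    low_before = 1.0 - ramp(energy[t - 1], 0.2, 0.5)
    big_rise = ramp(energy[t] - energy[t - 1], 0.1, 0.6)
    return min(low_before, big_rise)  # fuzzy AND


# The rough mark from the first stage (assumed to be frame 2) is refined
# within a window of +/- 3 frames.
refined = refine_boundary(rough=2, n_frames=len(energy),
                          truth=silence_to_vowel, max_shift=3)
print(refined)
```

In a system of the kind the abstract describes, each type of phonetic boundary would have its own set of conditions over several features; here a single two-condition rule over one feature stands in for that set.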
Similar resources
Speech enhancement based on hidden Markov model using sparse code shrinkage
This paper presents a new hidden Markov model-based (HMM-based) speech enhancement framework based on independent component analysis (ICA). We propose analytical procedures for training clean speech and noise models by the Baum re-estimation algorithm and present a maximum a posteriori (MAP) estimator based on a Laplace-Gaussian (for clean speech and noise respectively) combination in the HMM ...
Automatic segmentation of speech based on hidden Markov models and acoustic features
An accurate database segmented and labeled at the phonetic, subword or word level is very important for speech research. However, manual segmentation and labeling is a time-consuming and error-prone task. This paper describes an automatic procedure for the segmentation of speech into a set of acoustic sub-word units: given either the linguistic or the phonetic content of a speech utterance, the syst...
An Approach for Accident Forecasting Using Fuzzy Logic Rules: A Case Mining of Lift Truck Accident Forecasting in One of the Iranian Car Manufacturers
Fuzzy logic has entered many professional fields, creating different scientific attitudes and, in some cases, markedly affecting the results of applied research. However, the stochastic and uncertain conditions in the risk and accident field affect the possibility of forecasting and preventing the occurrence of the ...
Robust Potato Color Image Segmentation using Adaptive Fuzzy Inference System
Potato image segmentation is an important part of image-based potato defect detection. This paper presents a robust potato color image segmentation method that combines a fuzzy rule-based system, image thresholding based on Genetic Algorithm (GA) optimization, and morphological operators. The proposed potato color image segmentation is robust against variation of background, distance and ...
Color Image Edge Detection Using Fuzzy Membership Functions
Digital image processing is widely used in many research-oriented fields. Edge detection is one of the important techniques in image segmentation, used to locate the objects in the input image precisely. An edge is the boundary between an object and the background, and it also indicates the boundary between overlapping objects. One of the most commonly used operation analysis is e...